Direct Variational Perspective Shape from
Shading with Cartesian Depth Parametrisation 

Yong Chul Ju, Daniel Maurer, Michael Breuß and Andres Bruhn


Abstract Most of today’s state-of-the-art methods for perspective shape from
shading are modelled in terms of partial differential equations (PDEs) of
Hamilton-Jacobi type. To improve the robustness of such methods w.r.t. noise and
missing data, first approaches have recently been proposed that seek to embed the
underlying PDE into a variational framework with data and smoothness term. So far,
however, such methods either make use of a radial depth parametrisation that makes
the regularisation hard to interpret from a geometrical viewpoint or they consider
indirect smoothness terms that require additional consistency constraints to provide
valid solutions. Moreover, the minimisation of such frameworks is an intricate task,
since the underlying energy is typically non-convex. In this chapter we address all
three of the aforementioned issues. First, we propose a novel variational model that
operates directly on the Cartesian depth. In contrast to existing variational methods
for perspective shape from shading, this refers to both the data and the smoothness
term. Moreover, we employ a direct second-order regulariser with edge-preservation
property. This direct regulariser yields by construction valid solutions without
requiring additional consistency constraints. Finally, we also propose a novel
coarse-to-fine minimisation framework based on an alternating explicit scheme. This
framework allows us to avoid local minima during the minimisation and thus to improve
the accuracy of the reconstruction. Experiments show the good quality of our model as
well as the usefulness of the proposed numerical scheme.


1 Introduction 

Shape from Shading (SfS) is a classic task in computer vision. Given information
on light reflectance and illumination in a photographed scene, the aim of SfS is to
compute the 3D structure of a depicted object from the brightness variation in
a single input image. SfS has a wide variety of applications, ranging from large scale
problems such as astronomy ED or terrain reconstruction 0 to small scale tasks 
such as dentistry (2) or endoscopy ll34l f5T . 5211. 

Classical Methods. First approaches to SfS go back to 1951 and 1966, respectively,
when Van Diggelen 02) and Rindfleisch ED used SfS techniques to reconstruct
the surface of the moon. Later on in the 1970s, Horn [24] was the first one to tackle
the SfS problem by solving a partial differential equation (PDE). In 1981,
he and Ikeuchi were also the first ones to model the SfS problem using a variational
framework [28]. The most prominent classical variational approach is given by the
work of Horn and Brooks [26]. Assuming a simple orthographic projection model,
a light source at infinity as well as a Lambertian reflectance model, they proposed to 
compute the normals of the unknown surface as minimiser of an energy functional. 

Those first approaches, however, had several drawbacks. The model assumptions 
were very simple and mainly suitable in the context of astronomical applications. 
In fact, the use of an orthographic projection model with a light source located at 
infinity requires the distances between camera, light source and illuminated object 
to be huge. Also, the depth was not estimated directly, such that the SfS process
required a postprocessing step that performed a numerical integration of the estimated
surface normals. Thereby, inconsistent gradient fields turned out to be a problem, so
that extensions of the original model were required that tried to enforce this
consistency during or after the estimation [22] E3. Finally, in case of variational
methods, the smoothness term was restricted to a quadratic regulariser [26, 28]. While
such standard smoothness terms simplify the minimisation of the underlying energy, they
do not allow discontinuities in the depth to be preserved and thus lead to oversmoothed
solutions [35]. For a detailed review of most of the classical methods the reader is
referred to [20, 25, 27, 54].

Perspective Shape from Shading. At the end of the 1990s research mainly focused
on novel concepts for formulating orthographic SfS such as viscosity solutions [43]
and level set formulations [30]. However, for most applications results were not
satisfactory [54]. In the early 2000s, the situation changed completely. Inspired by
the work of Okatani and Deguchi [34], independently, Prados and his co-workers
[38, 39] as well as two other research groups ED ED proposed to consider a
perspective camera model. Evidently, such a model is particularly appropriate for tasks
that require the object to be relatively close to the camera such as e.g. in
medical endoscopy. In such cases the perspective effects dominate and an orthographic
projection model would cause significant systematic errors as shown in [47].
Secondly, Prados and colleagues proposed to shift the light source location from infinity
to the camera centre which can be seen as a good approximation of a camera with
photoflash. This made shape from shading attractive for a variety of photo-based
applications. Finally, also a physically motivated light attenuation term was introduced
that models a quadratic fall-off due to the inverse square law. As discussed in [9],
the use of this term largely resolved the convex-concave ambiguity that was
inherent to the classical orthographic model, although some ambiguities are still present.
Even the generalisation of such approaches to advanced reflectance models such as
the Oren-Nayar [36] or the Phong reflectance model El has recently been
investigated [4, 48].

However, this evolution of SfS models was accompanied by a different way of
formulating the SfS problem. Instead of using variational methods, the perspective
SfS problem was formulated in terms of hyperbolic PDEs [39]. Although such PDE
formulations allow for an efficient computation of the solution using fast marching
schemes S3, they suffer from two inherent drawbacks: (i) On the one hand, they
are prone to noise and missing data, since they do not rely on any form of
regularisation or filling-in. This can be particularly problematic in the context of
real-world images. (ii) On the other hand, it is difficult to extend the underlying
model of such PDE-based schemes by additional constraints such as smoothness terms,
multiple views, or additional light sources. While there have recently been some
PDE-based approaches to photometric stereo [32], one has to take care of ensuring the
uniqueness of the solution if the input data from multiple images is not consistent,
cf. the discussion in [33].

Variational Perspective Shape from Shading. Given the flexibility and
robustness of variational methods, it is not surprising that recently researchers tried to
close the evolutionary loop by integrating the perspective SfS model into a suitable
variational framework. So far, however, there are only a few works in the
literature that deal with this recent idea. On the one hand, there is the work of Ju et
al. [29] that embeds the PDE of Prados et al. [39] as data term into a variational
model and complements it with a discontinuity-preserving second order
smoothness term. However, since the approach penalises deviations from the PDE directly
and uses a parametrisation in terms of the radial depth, deviations in both the data
and the smoothness term are difficult to interpret geometrically or photometrically.
On the other hand, there is the approach of Abdelrahim et al. S3 that formulates
the data term in terms of brightness differences and makes use of a Cartesian depth
parametrisation. While the corresponding energy functional is thus more meaningful
from a geometric and photometric viewpoint, it defines smoothness based on surface
normals and thus needs an additional integrability constraint. Moreover, the
corresponding smoothness term is restricted to a simple homogeneous regulariser that
does not allow object edges to be preserved during the reconstruction. Finally, there are
the works of Zhang et al. [55] and Wu et al. [52] that also make use of a Cartesian
depth parametrisation but rely on an indirect estimation using auxiliary variables.
While the approach of Zhang et al. [55] resolves the resulting consistency problem
by considering an integrability constraint, the method of Wu et al. [52] repeatedly
integrates the surface normals during computation to ensure valid solutions.
Moreover, both approaches use derivations for their surface normals that are based on the
orthographic projection model of Horn and Brooks [27]. Unfortunately, the resulting
models are thus only valid in case of weak perspective distortions.

A final issue that is common to all four of the aforementioned works is the
difficulty of minimising the underlying energy. Since this energy is non-convex, two of
the methods rely on initialisations provided by closely related PDE-based SfS
approaches @ 131. This, however, contradicts the idea of introducing robustness into
the estimation, in particular in the presence of noise or missing data. In contrast,
the other two methods estimate the solution from scratch [29, 52]. However, those
methods do not provide any quantitative assessment of the reconstruction quality.

Let us summarise: While from a modelling viewpoint, it would be desirable to
design a variational model that directly solves for the Cartesian depth without the
need of integrability constraints or repeated integration steps, it would be helpful
from an optimisation viewpoint to develop a minimisation scheme that neither
depends on the solution of other SfS techniques as in @131 nor requires an accurate
initialisation to produce meaningful results.

Our Contributions. In this book chapter we contribute to the field of variational
SfS in three ways: (i) First, we consider a variational model for perspective SfS that
makes use of a Cartesian depth parametrisation and an edge-preserving Cartesian
depth regularisation. By penalising deviations from the image brightness in the data
term and regularising the Cartesian depth in the smoothness term directly, we obtain
an approach that is geometrically and photometrically meaningful. In this context,
we also point out a popular mistake in the derivation of the surface normal and
show two different ways to derive the normal correctly. (ii) Our method is a direct
approach to depth computation, i.e. it does not yield gradient fields that need to
be integrated in a subsequent step, nor do we employ integrability constraints. (iii)
Apart from the novel model, we also propose a novel minimisation strategy. By
embedding an alternating explicit scheme into a coarse-to-fine scheme, we obtain
an optimisation framework that yields significantly better results than a
traditional explicit scheme. Experiments with synthetic and real-world images show
the good quality of our reconstructions and the advantages of our numerical scheme.

Organisation of the Chapter. In Section 2 we propose a novel PDE-based model 
for perspective SfS that is based on a Cartesian parametrisation of the depth. In 
Section 3 we then embed this PDE into a variational framework with appropriate 
second order smoothness term. Details on the minimisation and the discretisation 
are provided in Section 4, while Section 5 comments on the integration of intrinsic
camera parameters. Finally, a detailed evaluation of our approach is presented in
Section 6. The paper concludes with a summary in Section 7.

Fig. 1 Relation between the radial depth factor u(x) (quotient between green and blue distance)
that denotes the depth in multiples of the focal length f and the Cartesian depth z(x) (red distance). 


2 Perspective SfS with Cartesian Depth Parametrisation 


In this section, we introduce a novel PDE-based SfS model that is parametrised in 
terms of the Cartesian depth. In contrast to most existing SfS models that estimate 
the radial depth or multiples thereof, such a Cartesian parametrisation expresses the 
unknown surface directly in terms of the Euclidean distance along the z-axis, which 
is the axis orthogonal to the image plane. 

Parametrisation of the Surface. The starting point for our new model is formed by
the classical PDE-approach of Prados et al. [39] which is originally parametrised in
terms of the radial depth. Key assumptions of this SfS model are that a point light
source is located at the optical centre of a perspective camera and that the surface
reflectance is Lambertian with uniform albedo that is fixed to one. The unknown
surface $\mathcal{S} : \overline{\Omega}_x \to \mathbb{R}^3$ can then be described as

\[
\mathcal{S}(\mathbf{x}, u(\mathbf{x})) = \frac{f\,u(\mathbf{x})}{\sqrt{|\mathbf{x}|^2 + f^2}}
\begin{pmatrix} x \\ y \\ -f \end{pmatrix},
\tag{1}
\]

where $\mathbf{x} = (x,y)^\top \in \overline{\Omega}_x$ is the position in the closure $\overline{\Omega}_x$ of the rectangular image
domain $\Omega_x \subset \mathbb{R}^2$, $f$ denotes the focal length of the camera and $u(\mathbf{x})$ is a multiple of
$f$ that describes the radial distance (depth) of the surface from the camera centre.

Since the third component in Eq. (1) corresponds to the negative Cartesian depth
$z$, we can derive the following relationship to the radial depth $u$:

\[
z(\mathbf{x}) = \frac{f}{\sqrt{|\mathbf{x}|^2 + f^2}}\, u(\mathbf{x})\, f = Q(\mathbf{x})\, u(\mathbf{x})\, f,
\tag{2}
\]

where $Q(\mathbf{x})$ denotes a spatially variant conversion factor given by

\[
Q(\mathbf{x}) = \frac{f}{\sqrt{|\mathbf{x}|^2 + f^2}}.
\tag{3}
\]

This relation is illustrated in Figure 1.
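
The conversion between the two depth parametrisations can be illustrated with a short Python sketch. It assumes that the surface of Eq. (1) has the form S = f u / sqrt(|x|^2 + f^2) (x, y, -f)^T, so that Eqs. (2) and (3) apply pointwise; all function names are hypothetical and serve this illustration only.

```python
import math


def conversion_factor(x, y, f):
    """Spatially variant factor Q(x) = f / sqrt(|x|^2 + f^2), cf. Eq. (3)."""
    return f / math.sqrt(x * x + y * y + f * f)


def radial_to_cartesian(u, x, y, f):
    """Cartesian depth z = Q(x) * u * f, cf. Eq. (2)."""
    return conversion_factor(x, y, f) * u * f


def cartesian_to_radial(z, x, y, f):
    """Inverse of Eq. (2): radial depth factor u = z / (Q(x) * f)."""
    return z / (conversion_factor(x, y, f) * f)


def surface_point(u, x, y, f):
    """Surface point in the assumed form of Eq. (1)."""
    s = f * u / math.sqrt(x * x + y * y + f * f)
    return (s * x, s * y, -s * f)


if __name__ == "__main__":
    u, x, y, f = 2.0, 3.0, 4.0, 12.0
    z = radial_to_cartesian(u, x, y, f)
    # The third surface component is the negative Cartesian depth.
    print(z, surface_point(u, x, y, f)[2])
```

At the image centre the factor Q equals one, so both parametrisations differ only by the focal length there; towards the image boundary the Cartesian depth becomes increasingly smaller than the radial distance f u.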

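Plugging Eq. (3) into Eq. (1), we then obtain the parametrisation of the original
surface $\mathcal{S}$ with respect to the Cartesian depth $z$:

\[
\mathcal{S}(\mathbf{x}, z(\mathbf{x}))
= Q(\mathbf{x})\, u(\mathbf{x}) \begin{pmatrix} x \\ y \\ -f \end{pmatrix}
= \frac{z(\mathbf{x})}{f} \begin{pmatrix} x \\ y \\ -f \end{pmatrix}
= \begin{pmatrix} \dfrac{z(\mathbf{x})\,x}{f} \\[2mm] \dfrac{z(\mathbf{x})\,y}{f} \\[2mm] -z(\mathbf{x}) \end{pmatrix}.
\tag{4}
\]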


Brightness Equation. After we have parametrised the original surface in terms of 
the Cartesian depth, let us now derive the resulting brightness equation that relates 
the local orientation of the surface to the image brightness. Assuming a Lambertian 
reflectance model and a quadratic light attenuation term that follows the inverse 
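square law, we obtain the following general brightness equation [39]:

\[
I(\mathbf{x}) = \frac{1}{r(\mathbf{x})^2} \left( \frac{\mathbf{n}(\mathbf{x})}{|\mathbf{n}(\mathbf{x})|} \cdot \mathbf{L}(\mathbf{x}) \right),
\tag{5}
\]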


where I is the recorded image, n is the surface normal vector, L stands for the 
normalised light direction vector, and r is the (radial) distance of the light source to 
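the surface. Knowing that $r = f u$ and using Eq. (2), we can express the quadratic
light attenuation term using the Cartesian depth $z$:

\[
r(\mathbf{x}) = f\, u(\mathbf{x}) = \frac{z(\mathbf{x})}{Q(\mathbf{x})}
\quad \Rightarrow \quad
\frac{1}{r(\mathbf{x})^2} = \frac{Q(\mathbf{x})^2}{z(\mathbf{x})^2}.
\tag{6}
\]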


What remains to be computed in terms of the Cartesian depth are the surface normal 
n and the light direction vector L, respectively. 

Surface Normal. Let us start by deriving the surface normal. Since the surface 
normal is the normal vector of the tangent plane, we first have to compute the partial 
derivatives of the surface in Eq. (4) in x- and y-direction, respectively:

\[
\mathcal{S}_x(\mathbf{x}, z) = \begin{pmatrix} \dfrac{z_x x + z}{f} \\[2mm] \dfrac{z_x y}{f} \\[2mm] -z_x \end{pmatrix},
\qquad
\mathcal{S}_y(\mathbf{x}, z) = \begin{pmatrix} \dfrac{z_y x}{f} \\[2mm] \dfrac{z_y y + z}{f} \\[2mm] -z_y \end{pmatrix}.
\tag{7}
\]

Here and for the whole paper we dropped the spatial dependency of $z$, $z_x$ and $z_y$ on
$\mathbf{x}$ for the sake of clarity. Taking the cross-product then yields the direction of the
surface normal


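\[
\mathbf{n}(\mathbf{x}) = \mathcal{S}_x(\mathbf{x}, z) \times \mathcal{S}_y(\mathbf{x}, z)
= \begin{pmatrix} \dfrac{z_x z}{f} \\[2mm] \dfrac{z_y z}{f} \\[2mm] \dfrac{z \left[ (\nabla z \cdot \mathbf{x}) + z \right]}{f^2} \end{pmatrix}.
\tag{8}
\]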
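
The cross product behind Eq. (8) can be checked numerically. The following Python sketch assumes the tangent vectors S_x = ((z_x x + z)/f, z_x y/f, -z_x) and S_y = (z_y x/f, (z_y y + z)/f, -z_y), which follow from differentiating the surface (z x/f, z y/f, -z); the function names are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3D vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def tangent_vectors(x, y, z, z_x, z_y, f):
    """Partial derivatives of the surface (z*x/f, z*y/f, -z) w.r.t. x and y."""
    s_x = ((z_x * x + z) / f, z_x * y / f, -z_x)
    s_y = (z_y * x / f, (z_y * y + z) / f, -z_y)
    return s_x, s_y


def normal_closed_form(x, y, z, z_x, z_y, f):
    """Closed-form (unnormalised) surface normal, cf. Eq. (8)."""
    return (z_x * z / f,
            z_y * z / f,
            z * (z_x * x + z_y * y + z) / f ** 2)


if __name__ == "__main__":
    args = (0.3, -0.2, 2.0, 0.5, -0.1, 1.5)  # x, y, z, z_x, z_y, f
    s_x, s_y = tangent_vectors(*args)
    print(cross(s_x, s_y))
    print(normal_closed_form(*args))
```

Both printed vectors agree up to rounding, since the mixed terms z_x z_y x and z_x z_y y cancel in the cross product.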


Light Direction. Let us now turn towards the computation of the light direction. 
Since the light source is assumed to be located in the camera centre which coincides 
with the origin of the coordinate system, the direction of the light rays and the 
direction of the optical rays coincide (up to sign). Hence, the light direction can just 
be read off Eq. (1) as

\[
\mathbf{L}(\mathbf{x}) = \frac{1}{\sqrt{|\mathbf{x}|^2 + f^2}} \begin{pmatrix} -x \\ -y \\ f \end{pmatrix}.
\tag{9}
\]


PDE-Based Model. By plugging the surface normal from Eq. (8) and the light
direction from Eq. (9) into the brightness equation (5), we finally obtain our perspective
SfS model with the new Cartesian depth parametrisation

\[
I - \frac{Q^3}{z \sqrt{f^2 |\nabla z|^2 + \left[ (\nabla z \cdot \mathbf{x}) + z \right]^2}} = 0.
\tag{10}
\]

Here and for the whole paper we dropped the spatial dependency of $I$ and $Q$ on
$\mathbf{x}$ for the sake of clarity.
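
To make the step from the general brightness equation to Eq. (10) tangible, the following Python sketch evaluates the brightness once from the geometric quantities (unnormalised normal, normalised light direction, radial distance r = z/Q) and once from the closed form Q^3 / (z W). It assumes a positive depth z and the brightness equation in the form I = (n/|n| . L) / r^2; the function names are hypothetical.

```python
import math


def w_term(x, y, z, z_x, z_y, f):
    """W = sqrt(f^2 |grad z|^2 + (grad z . x + z)^2)."""
    return math.sqrt(f * f * (z_x * z_x + z_y * z_y)
                     + (z_x * x + z_y * y + z) ** 2)


def brightness_closed_form(x, y, z, z_x, z_y, f):
    """Model brightness Q^3 / (z W), cf. Eq. (10)."""
    q = f / math.sqrt(x * x + y * y + f * f)
    return q ** 3 / (z * w_term(x, y, z, z_x, z_y, f))


def brightness_from_geometry(x, y, z, z_x, z_y, f):
    """Lambertian shading with inverse-square attenuation, cf. Eqs. (5)-(9)."""
    n = (z_x * z / f, z_y * z / f, z * (z_x * x + z_y * y + z) / f ** 2)
    d = math.sqrt(x * x + y * y + f * f)
    light = (-x / d, -y / d, f / d)
    norm_n = math.sqrt(sum(c * c for c in n))
    cos_angle = sum(a * b for a, b in zip(n, light)) / norm_n
    r = z * d / f  # radial distance r = z / Q
    return cos_angle / r ** 2


if __name__ == "__main__":
    args = (0.3, -0.2, 2.0, 0.5, -0.1, 1.5)  # x, y, z, z_x, z_y, f
    print(brightness_closed_form(*args), brightness_from_geometry(*args))
```

For a fronto-parallel plane at the image centre (x = y = 0, vanishing gradient) both expressions reduce to 1/z^2, i.e. to the pure inverse square law.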

The main properties of our new model (10) are naturally inherited from the
original PDE [39]: (i) Eq. (10) still belongs to the class of Hamilton-Jacobi equations
(HJEs) which have been intensively studied in the SfS literature. (ii) Therefore,
well-posedness can be achieved in the viscosity sense HE! HD. (iii) Proper numerical
discretisations must be considered when solving the HJE.

Let us note that the framework of viscosity solutions is a natural setting for HJEs
such as Eq. (10). The basic idea behind the notion of viscosity solutions is to add
a (typically, second order) regularisation term to the PDE and study the solution
as this term goes to zero. This procedure yields desirable stability properties and
makes it possible to consider even solutions with non-differentiable features such as kinks.
We refer the interested reader to 0131 for studying properties of viscosity solutions
and to El for their use in computer vision.

Furthermore, please note that our model can be seen as a generalisation of
the PDE-based approach in [56] that already makes use of the Cartesian depth
parametrisation, but does not yet consider the light attenuation term from physics.


3 Variational Model for Perspective SfS with Cartesian Depth
Parametrisation

So far we have derived a novel PDE-based model for perspective SfS with Cartesian 
depth parametrisation. Let us now discuss how this model can be integrated into a 
variational framework with smoothness term. 

Variational Model. To this end, we follow the idea from [29] and use a quadratic
error term based on our novel PDE as data term which is complemented with a
suitable second order regulariser. More precisely, we propose to compute the Cartesian
depth $z$ as minimiser of the following energy functional

\[
E(z) = \int_{\overline{\Omega}_x} \underbrace{c(\mathbf{x})\, D(\mathbf{x}, z, \nabla z)}_{\text{Data term}}
+ \alpha \underbrace{S(\mathrm{Hess}(z))}_{\text{Smoothness term}} \, d\mathbf{x},
\tag{11}
\]

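where $D$ is the data term, $S$ is the smoothness term, $c : \overline{\Omega}_x \subset \mathbb{R}^2 \to [0,1]$ is a
confidence function and $\alpha \in \mathbb{R}^+$ is a regularisation parameter that steers the degree
of smoothness of the solution. As mentioned before our data term is based on a
quadratic formulation that penalises deviations from our novel PDE. It is given by

\[
D(\mathbf{x}, z, \nabla z) = \left( I(\mathbf{x}) - \frac{Q(\mathbf{x})^3}{z\, W(\mathbf{x}, z, \nabla z)} \right)^2
\tag{12}
\]

with

\[
W(\mathbf{x}, z, \nabla z) = \sqrt{f^2 |\nabla z|^2 + \left[ (\nabla z \cdot \mathbf{x}) + z \right]^2}.
\tag{13}
\]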

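As smoothness term, we propose to use the following subquadratic and thus
edge-preserving second-order regulariser based on the Frobenius norm of the Hessian

\[
S(\mathrm{Hess}(z)) = \Psi\!\left( \|\mathrm{Hess}(z)\|_F^2 \right) = \Psi\!\left( z_{xx}^2 + 2 z_{xy}^2 + z_{yy}^2 \right),
\tag{14}
\]

where $\Psi$ is the Charbonnier function D3

\[
\Psi(s^2) = 2 \lambda^2 \sqrt{1 + \frac{s^2}{\lambda^2}}
\tag{15}
\]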

with contrast parameter $\lambda$. Such higher-order smoothness terms have already been
successfully applied in the context of perspective SfS parametrised in terms of the
radial depth [29], orthographic SfS [49], image denoising ED, optical lithography
ED and motion estimation tm. Finally, the use of the confidence function $c$ in the
data term allows unreliable image regions to be excluded which have been identified a
priori, e.g. by a texture detector or by a background segmentation algorithm. Such
functions are particularly useful in the context of real-world images that contain
texture, noise, or missing data mm.
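
The behaviour of the smoothness term can be illustrated with a few lines of Python. The sketch assumes the penaliser in the common Charbonnier form Psi(s^2) = 2 lambda^2 sqrt(1 + s^2 / lambda^2) and takes the second derivatives of z as given numbers; the function names are hypothetical.

```python
import math


def charbonnier(s2, lam):
    """Assumed penaliser Psi(s^2) = 2 lam^2 sqrt(1 + s^2 / lam^2)."""
    return 2.0 * lam ** 2 * math.sqrt(1.0 + s2 / lam ** 2)


def hessian_frobenius_sq(z_xx, z_xy, z_yy):
    """Squared Frobenius norm of the Hessian: z_xx^2 + 2 z_xy^2 + z_yy^2."""
    return z_xx ** 2 + 2.0 * z_xy ** 2 + z_yy ** 2


def smoothness(z_xx, z_xy, z_yy, lam):
    """Pointwise smoothness term S(Hess(z)) = Psi(||Hess(z)||_F^2)."""
    return charbonnier(hessian_frobenius_sq(z_xx, z_xy, z_yy), lam)


if __name__ == "__main__":
    lam = 1.0
    for s in (0.1, 1.0, 10.0, 100.0):
        # For large s the penalty grows roughly like 2 * lam * s, not like s^2.
        print(s, charbonnier(s * s, lam), s * s)
```

The printed values show the subquadratic growth: large second derivatives, as they occur at depth edges, are penalised far less severely than by a quadratic regulariser.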

Table 1 Comparison of the literature on variational models for perspective shape from shading.

                           Zhang et al.   Wu et al.     Abdelrahim       Ju et al.     Our
                           [55]           [52]          et al.           [29]          Work

Parametrisation            Cartesian      Cartesian     Cartesian        radial        Cartesian
                           depth          depth         depth            depth         depth

Reprojection Error         ✓              ✓             ✓                -             ✓
as Data Term

Light Attenuation Factor   -              ✓             ✓ ^1             ✓             ✓

Correct Surface Normal     - ^2           - ^2          ✓ ^3             ✓             ✓

Regularisation             Cartesian      Cartesian     Cartesian        radial        Cartesian
                           depth          depth         surface normal   depth         depth

Edge Preservation          -              -             -                ✓             ✓

No Integrability Term      -              - ^4          -                ✓             ✓

Direct Estimation ^5       -              -             ✓                ✓             ✓

^1 factor not expressed in terms of the Cartesian depth
^2 see explanation in appendix
^3 no details given in the paper but derivations shown in Q]
^4 integrability constraint realised via repeated integration of surface normals
^5 depth is computed without extra variables for surface normals


Properties. Our variational model from Eq. (11) has the following distinct features:

(i) Since the data term in Eq. (12) is inherited from Eq. (10), the perspective camera
projection is already taken into account. Moreover, since the reprojection error is 
penalised in the data term, deviations have a photometric interpretation. 

(ii) Since the regulariser is applied directly to the Cartesian depth, also deviations 
from smoothness become now more meaningful than in the case of a radial depth 
parametrisation. In particular, they can be interpreted geometrically. 

(iii) Moreover, in contrast to most existing approaches, the regulariser is able to
preserve edges in the reconstruction despite the regularisation effect.

(iv) Unreliable regions can be excluded from the data term via a confidence function
such that the smoothness term takes over and fills in information from the
neighbourhood. This can be advantageous in the context of texture, noise, or missing
data. Please note that in contrast to [29], we always guarantee a fixed amount of
regularisation by not restricting the smoothness term to unreliable locations.

(v) The depth of the surface is directly computed since we minimise for the unknown
depth $z$ in Eq. (11). This is in contrast to most variational methods that estimate
the depth in two steps, see e.g. | Kjl [22l EB'| where first the surface normals are 
computed by a variational model and then the depth is determined by integration. 

(vi) The solution given by the model fulfils the integrability constraint by
construction since we solve for $z$ and use $z_{xy} = z_{yx}$ in the smoothness term. Otherwise,
such as in 0, an additional integrability term would be needed to encourage
valid solutions.

(vii) Another advantage of the new parametrisation is that it allows a straightforward 
combination with other reconstruction methods such as stereo [42] or scene flow
estimation 01, since such approaches typically make use of the same Cartesian 
parametrisation and thus could be easily integrated into a joint framework. 

(viii) A final advantage is the fact that the approach could easily be extended to
multiple views, since transformations between the views are simpler if the approach is
parametrised in terms of the Cartesian depth instead of the radial depth.

To make the difference of our model to other variational approaches from the
literature explicit, the features of the different methods are compared in Table 1.

4 Minimisation 

Let us now discuss the minimisation of the proposed energy. To this end, we will first 
derive the associated Euler-Lagrange equation and then discuss its discretisation. 
Finally, we will sketch a coarse-to-fine minimisation strategy with an alternating 
explicit scheme to solve the resulting nonlinear equations. 

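Euler-Lagrange Equation. The calculus of variations [15] tells us that the
minimiser $z$ of our energy in Eq. (11) has to fulfil the corresponding Euler-Lagrange
equation. Omitting the dependencies on all variables in order to ease the readability,
this equation is given by

\[
0 = c \left( [D]_z - \frac{\partial}{\partial x} [D]_{z_x} - \frac{\partial}{\partial y} [D]_{z_y} \right)
+ \alpha \left( \frac{\partial^2}{\partial x^2} [S]_{z_{xx}}
+ 2\, \frac{\partial^2}{\partial x \partial y} [S]_{z_{xy}}
+ \frac{\partial^2}{\partial y^2} [S]_{z_{yy}} \right),
\tag{16}
\]
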

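where we exploited the fact that

\[
\frac{\partial^2}{\partial x \partial y} [S]_{z_{xy}} = \frac{\partial^2}{\partial y \partial x} [S]_{z_{yx}}.
\tag{17}
\]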

On a structural level, this Euler-Lagrange equation is somewhat more complicated
than its counterparts for indirect methods in [52, 55]. Such indirect methods model
the surface normal using auxiliary variables $p = z_x$ and $q = z_y$ and thus do not have
the additional data term contributions $\frac{\partial}{\partial x} [D]_{z_x}$ and $\frac{\partial}{\partial y} [D]_{z_y}$.

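Let us now take a closer look at all the individual terms that occur in Eq. (16).
After some computations we obtain

\begin{align}
[D]_z &= 2 \left( I - \frac{Q^3}{zW} \right) \left( \frac{Q^3}{z^2 W} + \frac{Q^3}{z W^2} [W]_z \right)
= 2 \left( I - \frac{Q^3}{zW} \right) \frac{Q^3}{zW} \left( \frac{1}{z} + \frac{\nabla z \cdot \mathbf{x} + z}{W^2} \right),
\tag{18} \\
\frac{\partial}{\partial x} [D]_{z_x} &= \frac{\partial}{\partial x} \left[ 2 \left( I - \frac{Q^3}{zW} \right) \frac{Q^3}{z W^3}
\left( f^2 z_x + (\nabla z \cdot \mathbf{x} + z)\, x \right) \right],
\tag{19} \\
\frac{\partial}{\partial y} [D]_{z_y} &= \frac{\partial}{\partial y} \left[ 2 \left( I - \frac{Q^3}{zW} \right) \frac{Q^3}{z W^3}
\left( f^2 z_y + (\nabla z \cdot \mathbf{x} + z)\, y \right) \right],
\tag{20}
\end{align}

as well as

\begin{align}
\frac{\partial^2}{\partial x^2} [S]_{z_{xx}} &= 2\, \frac{\partial^2}{\partial x^2} \left[ \Psi'\!\left( \|\mathrm{Hess}(z)\|_F^2 \right) z_{xx} \right],
\tag{21} \\
2\, \frac{\partial^2}{\partial x \partial y} [S]_{z_{xy}} &= 4\, \frac{\partial^2}{\partial x \partial y} \left[ \Psi'\!\left( \|\mathrm{Hess}(z)\|_F^2 \right) z_{xy} \right],
\tag{22} \\
\frac{\partial^2}{\partial y^2} [S]_{z_{yy}} &= 2\, \frac{\partial^2}{\partial y^2} \left[ \Psi'\!\left( \|\mathrm{Hess}(z)\|_F^2 \right) z_{yy} \right],
\tag{23}
\end{align}
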


where the derivative of the penaliser function $\Psi(s^2)$ reads

\[
\Psi'(s^2) = \frac{1}{\sqrt{1 + \dfrac{s^2}{\lambda^2}}}.
\tag{24}
\]



While the contributions of the data term are related to the influence of $z$ and $\nabla z$ on
the brightness equation, the contributions of the smoothness term define an
edge-preserving fourth-order diffusion process. This becomes explicit as follows: Since
$\Psi'(s^2)$ becomes small for large values of $s^2$, this reduces the effect of the
smoothing at locations with high curvature, i.e. where $\|\mathrm{Hess}(z)\|_F^2$ is large. After we have
derived the resulting Euler-Lagrange equation, let us now discuss how this equation
can be discretised appropriately.
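
The edge-preserving effect of the penaliser derivative can be verified numerically. The following Python sketch assumes the Charbonnier form Psi(s^2) = 2 lambda^2 sqrt(1 + s^2 / lambda^2), for which differentiation with respect to s^2 gives 1 / sqrt(1 + s^2 / lambda^2), and compares this expression with a central finite difference; the function names are hypothetical.

```python
import math


def charbonnier(s2, lam):
    """Assumed penaliser Psi(s^2) = 2 lam^2 sqrt(1 + s^2 / lam^2)."""
    return 2.0 * lam ** 2 * math.sqrt(1.0 + s2 / lam ** 2)


def charbonnier_derivative(s2, lam):
    """Derivative of Psi w.r.t. its argument s^2 (acts as a diffusivity)."""
    return 1.0 / math.sqrt(1.0 + s2 / lam ** 2)


def finite_difference(s2, lam, eps=1e-6):
    """Central difference approximation of the derivative of Psi at s^2."""
    return (charbonnier(s2 + eps, lam) - charbonnier(s2 - eps, lam)) / (2.0 * eps)


if __name__ == "__main__":
    lam = 0.5
    for s2 in (0.0, 1.0, 100.0):
        print(s2, charbonnier_derivative(s2, lam), finite_difference(s2, lam))
```

The weight equals one in flat regions and decays towards zero as the squared Frobenius norm of the Hessian grows, which is what reduces the smoothing at locations with high curvature.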

Discretisation. In order to discretise the contributions of the data term given by
Eqs. (18)-(20), we employ the upwind scheme from [43] in view of the hyperbolic
nature of the underlying PDE. In 1D, the corresponding upwind discretisation reads

\[
z_x \approx \max\left( D^- z,\; -D^+ z,\; 0 \right),
\tag{25}
\]

with

\[
D^- z = \frac{z_i - z_{i-1}}{h_x} \quad \text{and} \quad D^+ z = \frac{z_{i+1} - z_i}{h_x},
\tag{26}
\]

where $h_x$ denotes the grid size. Please note that in contrast to upwind schemes for
eikonal equations B31 that typically approximate only the magnitude of the
gradient, the sign matters in our case, such that we have to choose

\[
z_x := \begin{cases} D^+ z & \text{if } z_x = -D^+ z, \\ z_x & \text{otherwise.} \end{cases}
\tag{27}
\]

This selects the actual forward difference, if the second argument in (25) is the
maximum EE). This scheme can be extended in a straightforward way to 2D. For
discretising the contributions of the smoothness term, a standard central difference
scheme is used.
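
The 1D upwind selection including the sign correction can be sketched in Python as follows. The sketch works on a plain list of depth samples and an interior index; the function name and the treatment of ties (taken literally from the case distinction above) are assumptions of this illustration.

```python
def upwind_derivative(z, i, h):
    """Signed 1D upwind approximation of z_x at interior index i.

    First the maximum of backward difference, negated forward difference
    and zero is taken; if the negated forward difference was selected,
    the actual (signed) forward difference is returned instead.
    """
    d_minus = (z[i] - z[i - 1]) / h
    d_plus = (z[i + 1] - z[i]) / h
    z_x = max(d_minus, -d_plus, 0.0)
    if z_x == -d_plus:
        return d_plus
    return z_x


if __name__ == "__main__":
    print(upwind_derivative([0.0, 1.0, 4.0], 1, 1.0))  # backward difference
    print(upwind_derivative([4.0, 1.0, 0.0], 1, 1.0))  # signed forward difference
    print(upwind_derivative([1.0, 0.0, 1.0], 1, 1.0))  # local minimum: no contribution
```

On an increasing profile the backward difference is chosen, on a decreasing profile the forward difference with its negative sign, and at a local minimum the derivative is set to zero.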

Since it is difficult to discretise the Euler-Lagrange equation directly, we followed
a first-discretise-then-optimise scheme. To this end, we used the aforementioned
finite difference approximations to discretise the energy in (11), applying the upwind
scheme for the data term and a central difference approximation for the smoothness 
term. Then, by computing the derivatives of the discrete energy we obtain a proper 
discretisation for the Euler-Lagrange equation. 

Finally, by using the Euler forward time discretisation method

\[
z_t \approx \frac{z^{n+1} - z^n}{\tau}
\tag{28}
\]

with $\tau$ being a time step size, we can reformulate the solution of Eq. (16) as the
steady state of the corresponding evolution equation in artificial time. Thus we
obtain the following explicit scheme

\[
\frac{z^{n+1} - z^n}{\tau} + \mathrm{EL}^n = 0
\quad \Leftrightarrow \quad
z^{n+1} = z^n - \tau\, \mathrm{EL}^n,
\tag{29}
\]

where $\mathrm{EL}^n$ is the discretisation of the Euler-Lagrange equation evaluated at time $n$.
Please note that this discretisation may change over time, since we re-discretised the
energy in each iteration by adapting the direction of the discretisation of the upwind
scheme (forward, backward, no contribution) based on evaluating Eqs. (25)-(27)
for the result of the previous time step. In that sense we use a lagged discretisation
approach, where the discretisation is updated in each iteration.
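
The structure of the explicit scheme can be illustrated independently of the actual SfS energy. The following Python sketch applies the update z^{n+1} = z^n - tau * EL^n to a deliberately simple toy energy E(z) = sum_i (z_i - g_i)^2 / 2, whose Euler-Lagrange residual is z - g. The toy energy and all names are assumptions made for this illustration and do not represent the discretised model of this chapter.

```python
def explicit_scheme(z0, euler_lagrange, tau, iterations):
    """Iterate z <- z - tau * EL(z) for a fixed number of steps.

    euler_lagrange is a callable that returns the discrete Euler-Lagrange
    residual for the current iterate; it is re-evaluated in every step,
    which mirrors the lagged re-discretisation described in the text.
    """
    z = list(z0)
    for _ in range(iterations):
        residual = euler_lagrange(z)
        z = [z_i - tau * r_i for z_i, r_i in zip(z, residual)]
    return z


if __name__ == "__main__":
    g = [1.0, 2.0]

    def toy_residual(z):
        # Gradient of the toy energy 0.5 * sum (z_i - g_i)^2.
        return [z_i - g_i for z_i, g_i in zip(z, g)]

    print(explicit_scheme([0.0, 0.0], toy_residual, tau=0.5, iterations=10))
```

For the toy energy the error shrinks by the factor (1 - tau) per step; for the actual model the admissible time step is far more restrictive, which motivates the acceleration strategies discussed in the remainder of this section.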

Coarse-to-Fine Approach. Since the underlying energy functional is highly
non-convex, the proposed explicit scheme may get trapped in local minima. To tackle
this problem, we propose to embed the estimation into a coarse-to-fine framework.
Starting from a very coarse resolution, we successively refine the input image while
repeatedly reconstructing the surface. Thereby, solutions from coarser levels serve
as initialisation for the finer scales. Similar hierarchical schemes have already been
successfully applied to many other problems in computer vision; see e.g. ©HD-

Apart from improving the quality of the results by avoiding local minima,
coarse-to-fine schemes also render the estimation more robust w.r.t. the choice of the
initialisation. In fact, if sufficiently many resolution levels were used, we could hardly
observe any impact of the initialisation on the quality of the final results. Since a
good initial guess can still be useful to speed up the computation, we propose to
initialise the depth by pointwise solving the data term in Eq. (12) for $\nabla z = 0$:

\[
D(\mathbf{x}, z, 0) = 0 \quad \Rightarrow \quad z = \sqrt{\frac{Q(\mathbf{x})^3}{I(\mathbf{x})}}.
\tag{30}
\]

This can be seen as an efficient compromise between using the full model which is
evidently not feasible and only considering the inverse square law, i.e. $z = 1/\sqrt{I(\mathbf{x})}$,
which completely neglects the effect of the surface orientation and thus actually
provides a local upper bound for the correct depth. In any case, in contrast to other
variational SfS methods from the literature, our technique does not have to rely on
initialisations from non-variational SfS approaches © [551 or surface integration
methods [52] to provide meaningful results.
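
The pointwise initialisation can be sketched in a few lines of Python. With a vanishing depth gradient the term W reduces to z for positive depths, so that the data term vanishes for z = sqrt(Q^3 / I). The function names and the small lower bound on the brightness, which avoids a division by zero in dark pixels, are assumptions of this illustration.

```python
import math


def conversion_factor(x, y, f):
    """Q(x) = f / sqrt(|x|^2 + f^2)."""
    return f / math.sqrt(x * x + y * y + f * f)


def initial_depth(intensity, x, y, f, min_intensity=1e-8):
    """Pointwise initial depth z = sqrt(Q^3 / I) for a vanishing gradient.

    min_intensity is a hypothetical safeguard for (almost) black pixels.
    """
    q = conversion_factor(x, y, f)
    return math.sqrt(q ** 3 / max(intensity, min_intensity))


def inverse_square_depth(intensity, min_intensity=1e-8):
    """Depth from the inverse square law alone: z = 1 / sqrt(I)."""
    return 1.0 / math.sqrt(max(intensity, min_intensity))


if __name__ == "__main__":
    intensity, x, y, f = 0.25, 3.0, 4.0, 12.0
    print(initial_depth(intensity, x, y, f), inverse_square_depth(intensity))
```

Since Q never exceeds one, the proposed initialisation never lies above the depth obtained from the inverse square law alone; both coincide at the image centre.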

Let us now discuss the details of our coarse-to-fine approach. To this end, we
introduce the parameter $\eta$ that specifies the downsampling factor between two
consecutive resolution levels and that is typically chosen in the interval (0.5,1). Then
the grid size at level $k$ of our coarse-to-fine approach can be computed as

\[
h_x^k = h_x \cdot \eta^{-k}, \qquad h_y^k = h_y \cdot \eta^{-k},
\tag{31}
\]

where $k = 0$ is the original resolution and $k = k_{\max}$ is the coarsest level. This tells
us that the grid size becomes larger at coarser scales which intuitively makes sense,
since the size of the image plane remains constant while the number of pixels
decreases. At the same time, however, this increase of the grid size leads to a major
problem: Since the contributions of the smoothness term given by Eqs. (21)-(23)
involve fourth-order derivatives that scale proportionally to $1/h^4$, the strength of the
regularisation actually decreases with $\eta^{4k}$ on coarser scales. In order to compensate
for this effect, we thus propose to scale the smoothness weight $\alpha$ according to

\[
\alpha^k = \eta^{-4k} \cdot \alpha.
\tag{32}
\]

This guarantees a similar amount of regularisation for all resolution levels.
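
The level-dependent parameters can be computed with a small helper. The following Python sketch returns the grid sizes and the rescaled smoothness weight for a given level k; the function name is hypothetical.

```python
def level_parameters(h_x, h_y, alpha, eta, k):
    """Grid sizes and smoothness weight at coarse-to-fine level k.

    The grid sizes grow by eta^(-k); the smoothness weight is scaled by
    eta^(-4k) to compensate for the 1/h^4 scaling of the fourth-order
    smoothness contributions.
    """
    scale = eta ** (-k)
    return h_x * scale, h_y * scale, alpha * eta ** (-4 * k)


if __name__ == "__main__":
    h_x, h_y, alpha, eta = 1.0, 1.0, 2.0, 0.5
    for k in range(4):
        h_xk, h_yk, alpha_k = level_parameters(h_x, h_y, alpha, eta, k)
        # The ratio alpha_k / h^4 stays constant over all levels.
        print(k, h_xk, h_yk, alpha_k, alpha_k / h_xk ** 4)
```

The last printed column is constant, which reflects that the effective strength of the regularisation is kept at the same level on all resolutions.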

Alternating Explicit Scheme. Finally, we observed in our experiments that the
terms in Eqs. (19) and (20) that refer to the influence of the depth gradient $\nabla z$
on the brightness equation require the time step size $\tau$ to be chosen rather small.
In particular, unlike the smoothness term, these terms do not have a weighting
parameter that can be adjusted appropriately. As a consequence, the minimisation
typically needs several thousands or even millions of iterations. To counter this
problem, we propose the following alternating estimation strategy at each resolution
level: For a fixed number of iterations $n$, instead of performing $n$ iterations using
the original explicit scheme, we propose an alternating iterative scheme that first
does $n/2$ iterations with a simplified explicit scheme neglecting the two terms in
Eqs. (19) and (20), followed by $n/2$ iterations with the entire explicit scheme. Since
the neglected terms are based on second-order derivatives and the remaining terms did
not strongly affect the convergence, we found empirically that we can choose the time
step size approximately $\min(h_x^{-2}, h_y^{-2})$ times larger for the first $n/2$
iterations (given that $h_x, h_y \ll 1$). In our experiments this leads to speed-ups of
about one to four orders of magnitude. Moreover, in most cases, even the simplified
scheme was sufficient to achieve excellent results. One should note that, from a
numerical viewpoint, the simplified scheme can be understood as an optimisation
method for a series of energy functionals of the type of Eq. ©, where the gradient
$\nabla z$ is lagging and thus has no direct influence on the optimisation.
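
The control flow of the alternating strategy can be sketched as follows. This is only a schematic illustration: the two gradient functions are hypothetical scalar stand-ins (a mild term plus a stiff term that forces a small time step), not the actual discretised Euler-Lagrange terms, and all names are assumptions made for this example.

```python
def alternating_explicit(z0, simplified_gradient, full_gradient,
                         tau_simplified, tau_full, n):
    """Run n/2 explicit steps with the simplified scheme, then n/2 with the full one."""
    z = z0
    half = n // 2
    for _ in range(half):
        # simplified scheme: stiff terms neglected, so a larger time step is admissible
        z = z - tau_simplified * simplified_gradient(z)
    for _ in range(n - half):
        # full scheme: all terms present, which requires the small time step
        z = z - tau_full * full_gradient(z)
    return z


# Toy stand-ins (assumed for illustration): both gradients vanish at z = 2.
def simplified_gradient(z):
    return z - 2.0


def full_gradient(z):
    return (z - 2.0) + 100.0 * (z - 2.0)   # extra stiff term limits the step size


z_final = alternating_explicit(0.0, simplified_gradient, full_gradient,
                               tau_simplified=0.5, tau_full=0.01, n=20)
print(z_final)
```

In this toy setting the first half of the iterations makes fast progress with the large time step, and the second half refines the result with the complete update.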


5 Intrinsic Parameters 

So far we have derived a variational model for perspective SfS with Cartesian depth
parametrisation that is given in terms of image coordinates. Let us now discuss how
the model and the minimisation have to be adapted if we additionally consider the
intrinsic camera parameters, i.e. if we express the model in terms of pixel coordinates.


Coordinate Transformation. Let the corresponding calibration matrix be given by

$$K = \begin{pmatrix} f/h_x & 0 & c_1 \\ 0 & f/h_y & c_2 \\ 0 & 0 & 1 \end{pmatrix}, \tag{33}$$

where $(c_1, c_2)^\top$ denotes the location of the principal point, and $h_x$ and $h_y$
are the grid sizes in x- and y-direction, respectively [23]. Knowing this matrix allows
us to reformulate the image coordinates $\mathbf{x} = (x, y)^\top$ of our original model
in terms of pixel coordinates $\mathbf{a} = (a, b)^\top$. The corresponding
transformation is given by


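$$\begin{pmatrix} a \\ b \\ 1 \end{pmatrix} = K \cdot \left(-\frac{1}{f}\right) \begin{pmatrix} x \\ y \\ -f \end{pmatrix} \qquad \Longleftrightarrow \qquad \begin{pmatrix} x \\ y \\ -f \end{pmatrix} = -f \cdot K^{-1} \begin{pmatrix} a \\ b \\ 1 \end{pmatrix}, \tag{34}$$
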

where one has to take care that the image plane is at distance $f$ from the camera
centre. Plugging Eq. (33) into Eq. (34) then yields

$$\begin{pmatrix} x(a) \\ y(b) \end{pmatrix} = \begin{pmatrix} h_x & 0 \\ 0 & h_y \end{pmatrix} \begin{pmatrix} a \\ b \end{pmatrix} - \begin{pmatrix} h_x c_1 \\ h_y c_2 \end{pmatrix}. \tag{35}$$
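
The conversion between pixel and image coordinates can be sketched in a few lines. The function names are hypothetical, and the example values ($f = 1$, $h_x = h_y = 1/200$, principal point $(128, 128)$) are merely plausible illustrative numbers. The forward map multiplies the first two rows of the calibration matrix with the normalised point $(x/f, y/f, 1)^\top$; the backward map is the componentwise form of Eq. (35).

```python
def image_to_pixel(x, y, f, h_x, h_y, c1, c2):
    """Apply the calibration matrix K to the normalised image point (x/f, y/f, 1)."""
    u, v, w = x / f, y / f, 1.0
    a = (f / h_x) * u + c1 * w   # first row of K
    b = (f / h_y) * v + c2 * w   # second row of K
    return a, b


def pixel_to_image(a, b, h_x, h_y, c1, c2):
    """Componentwise form of Eq. (35): x = h_x a - h_x c_1, y = h_y b - h_y c_2."""
    return h_x * a - h_x * c1, h_y * b - h_y * c2


# Illustrative intrinsics (assumed values).
f, h_x, h_y, c1, c2 = 1.0, 1 / 200, 1 / 200, 128.0, 128.0

centre = pixel_to_image(128.0, 128.0, h_x, h_y, c1, c2)   # principal point
corner = pixel_to_image(0.0, 0.0, h_x, h_y, c1, c2)       # pixel origin
back = image_to_pixel(corner[0], corner[1], f, h_x, h_y, c1, c2)
print(centre, corner, back)
```

Note that the focal length cancels in the backward map, which is why Eq. (35) only involves the grid sizes and the principal point.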


Variational Model. Now we are in the position to reformulate our entire model in
terms of pixel coordinates. Substituting Eq. (35) into our original energy and
transforming the integration domain $\Omega_a = \mathbf{x}^{-1}(\Omega_x)$ accordingly,
we obtain the following variational model expressed in terms of pixel coordinates

$$E(z(\mathbf{x}(\mathbf{a}))) = \int_{\Omega_a} \underbrace{c(\mathbf{x}(\mathbf{a}))\, D\big(\mathbf{x}(\mathbf{a}), z(\mathbf{x}(\mathbf{a})), \nabla z(\mathbf{x}(\mathbf{a}))\big)}_{\text{Data term}} + \alpha\, \underbrace{S\big(\mathrm{Hess}(z)(\mathbf{x}(\mathbf{a}))\big)}_{\text{Smoothness term}} \; d\mathbf{a}. \tag{36}$$

Please note that we omitted the substitution factor given by
$|\det(J(\mathbf{x}(\mathbf{a})))|$, where $J$ is the Jacobian, since this factor is
constant and thus does not change the minimiser of our energy. Let us now derive the
corresponding Euler-Lagrange equation for our novel model expressed in terms of pixel
coordinates.

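Euler-Lagrange Equation. Analogously to Eq. (17) we drop the dependencies on
all variables and obtain the following Euler-Lagrange equation

$$0 = c\,\frac{\partial D}{\partial z} - \frac{\partial}{\partial a}\Big(c\,\frac{\partial D}{\partial z_a}\Big) - \frac{\partial}{\partial b}\Big(c\,\frac{\partial D}{\partial z_b}\Big) + \alpha \Big(\frac{\partial^2}{\partial a^2}\frac{\partial S}{\partial z_{aa}} + 2\,\frac{\partial^2}{\partial a\,\partial b}\frac{\partial S}{\partial z_{ab}} + \frac{\partial^2}{\partial b^2}\frac{\partial S}{\partial z_{bb}}\Big) \tag{37}$$

$$\phantom{0} = c\,\frac{\partial D}{\partial z} - \frac{\partial}{\partial x}\Big(c\,\frac{\partial D}{\partial z_x}\Big) - \frac{\partial}{\partial y}\Big(c\,\frac{\partial D}{\partial z_y}\Big) + \alpha \Big(\frac{\partial^2}{\partial x^2}\frac{\partial S}{\partial z_{xx}} + 2\,\frac{\partial^2}{\partial x\,\partial y}\frac{\partial S}{\partial z_{xy}} + \frac{\partial^2}{\partial y^2}\frac{\partial S}{\partial z_{yy}}\Big), \tag{38}$$

where we exploited the following relation between derivatives in pixel and image
coordinates due to Eq. (35)

$$\frac{\partial}{\partial a} = h_x \frac{\partial}{\partial x}, \quad \frac{\partial}{\partial b} = h_y \frac{\partial}{\partial y}, \quad \frac{\partial}{\partial z_a} = \frac{1}{h_x}\frac{\partial}{\partial z_x}, \quad \frac{\partial}{\partial z_b} = \frac{1}{h_y}\frac{\partial}{\partial z_y},$$
$$\frac{\partial}{\partial z_{aa}} = \frac{1}{h_x h_x}\frac{\partial}{\partial z_{xx}}, \quad \frac{\partial}{\partial z_{ab}} = \frac{1}{h_x h_y}\frac{\partial}{\partial z_{xy}}, \quad \frac{\partial}{\partial z_{bb}} = \frac{1}{h_y h_y}\frac{\partial}{\partial z_{yy}}. \tag{39}$$

The equality between Eq. (37) and Eq. (38) shows that the Euler-Lagrange equations
of our models in pixel and image coordinates are basically identical. One only
has to parametrise the terms (18)-(23) that have been originally derived in image
coordinates using the coordinate transform in Eq. (35). Apart from that, the
discretisation can be performed in accordance with our explanations from the previous
section. In this context, the grid size is given by the intrinsic parameters $h_x$ and
$h_y$. Moreover, one has to adapt the camera matrix $K$ at each level of the
coarse-to-fine scheme. This requires scaling both the grid size and the principal
point $(c_1, c_2)^\top$.
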
6 Evaluation


Test Images and Error Measures. In order to evaluate our novel approach we
make use of four synthetic images with ground truth that fulfil the underlying
assumptions regarding reflectance and illumination. This allows us to compute two
error measures: one with respect to the reconstructed surface and the other one with
respect to the reprojected image. The first error measure is the relative surface error
(RSE) of a pointwise computed Euclidean distance between the computed surface
$\mathcal{S}$ and the ground truth surface $\mathcal{S}^{\mathrm{gt}}$. It is given by

$$\mathrm{RSE} = \frac{\sum_{\Omega_a} \big|\mathcal{S}(\mathbf{x}(\mathbf{a})) - \mathcal{S}^{\mathrm{gt}}(\mathbf{x}(\mathbf{a}))\big|}{\sum_{\Omega_a} \big|\mathcal{S}^{\mathrm{gt}}(\mathbf{x}(\mathbf{a}))\big|}, \tag{41}$$


where the normalisation allows us to determine the reconstruction error relative to
the ground truth shape. This in turn makes errors of differently scaled surfaces
comparable. The second error measure is the relative image error (RIE) between the
reprojected image $I$ and the given input image $I^{\mathrm{gt}}$. It is defined as
follows

$$\mathrm{RIE} = \frac{\sum_{\Omega_a} \big|I(\mathbf{x}(\mathbf{a})) - I^{\mathrm{gt}}(\mathbf{x}(\mathbf{a}))\big|}{\sum_{\Omega_a} \big|I^{\mathrm{gt}}(\mathbf{x}(\mathbf{a}))\big|}. \tag{42}$$

This time, however, the normalisation is performed with respect to the brightness of
the input image to make reprojection results for input images with different
brightness scale comparable. Summarising: While the first measure reflects how well the
reconstruction matches the ground truth surface, the second measure determines
how well the reprojection fits the input data.
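
Both error measures are straightforward to evaluate. The following is a minimal sketch under the assumption that surfaces are given as lists of 3-D points (one per pixel) and images as flat lists of grey values; the function names and the tiny sample data are hypothetical.

```python
import math


def relative_surface_error(surface, surface_gt):
    """RSE: summed pointwise Euclidean distances, normalised by the ground truth norms."""
    numerator = sum(math.dist(p, q) for p, q in zip(surface, surface_gt))
    denominator = sum(math.hypot(*q) for q in surface_gt)
    return numerator / denominator


def relative_image_error(image, image_gt):
    """RIE: summed absolute brightness differences, normalised by the input brightness."""
    numerator = sum(abs(i - g) for i, g in zip(image, image_gt))
    denominator = sum(abs(g) for g in image_gt)
    return numerator / denominator


# Tiny made-up example with two pixels.
surface_gt = [(0.0, 0.0, 1.0), (0.0, 0.0, 2.0)]
surface = [(0.0, 0.0, 1.1), (0.0, 0.0, 2.0)]
rse = relative_surface_error(surface, surface_gt)

image_gt = [110.0, 190.0]
image = [100.0, 200.0]
rie = relative_image_error(image, image_gt)
print(rse, rie)
```

Because both measures are ratios of sums, they are invariant under a common rescaling of the surface or of the image brightness, which is what makes results for differently scaled data comparable.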

Let us now discuss the considered test images, which are depicted in Fig. 2, in
detail. The first synthetic test image Sombrero was generated from a known parametric
surface, using the following equation

$$Z(X, Y) = 0.5\, \frac{\sin(r(X,Y))}{r(X,Y)} + 1.7, \qquad r(X,Y) = \sqrt{(10X)^2 + (10Y)^2}. \tag{43}$$

The image was rendered using Eq. 0 at a size of $256 \times 256$ pixels, where the
focal length was set to $f = 1$, the grid size was chosen to be $h_x = h_y = 1/200$ and
the principal point was fixed at $\mathbf{c} = (128, 128)^\top$. The second test image
Suzanne was generated using the open-source software Blender HD. In this context, the
Z-buffer of the rendering path and the corresponding intrinsic parameters ($f = 35$,
$h_x = 1/16$, $h_y = 9/128$, $\mathbf{c} = (256, 128)^\top$) were extracted and the
final image was rendered at a size of $512 \times 256$ using Eq. 0 as before. The
other two test images Stanford Bunny and Dragon have been computed likewise using 3-D
models obtained from the Stanford 3D scanning repository [58]. For them a size of
$256 \times 256$ pixels and the intrinsic parameters ($f = 35$, $h_x = 1/8$,
$h_y = 9/128$, $\mathbf{c} = (128, 128)^\top$) were chosen. Finally, all images were
saved as 8-bit grey-value images.

Let us finally comment on the selection of the parameters in our experiments.
In order to keep the number of parameters low, we choose a preferred standard set
of parameters for all the following experiments, unless otherwise stated: a
downsampling factor of $\eta = 0.8$ for the coarse-to-fine approach, $n = 10^6$ solver
iterations on each coarse-to-fine level and a contrast parameter of
$\lambda = 10^{-3}$. Moreover, the time step size $\tau$ provided in the different
experiments always refers to the simplified explicit scheme. The time step size for
the full explicit scheme is $\min(h_x^{-2}, h_y^{-2})$ times smaller.



Fig. 2 Synthetic test images. From left to right: Sombrero, Suzanne, Stanford Bunny and Dragon. 


Results on Synthetic Test Images. In our first experiment we evaluate the
reconstruction quality of our novel approach. To this end, we applied our perspective
SfS algorithm to all four of the previously discussed test images and compared the
reprojected image and the reconstruction to the ground truth; see Fig. 3 and Fig. 4.
Herein, the depth values are colour-coded in such a way that depth increases from red
via green to blue. As one can see, both the reprojected image and the estimated
depth values coincide very well with the ground truth. This is also confirmed by the
corresponding surface error maps in Fig. 5. Indeed, only small differences for the
Stanford Bunny (right paw) and the Dragon (tail tip) are visible. As a consequence,
both error measures, which are listed in Table 2, are very small. Moreover, one can
see that the proposed subquadratic penaliser outperforms a quadratic smoothness
term in most cases. Only for the Sombrero, which has a rather smooth surface, the
reconstruction error is smaller in the quadratic case.

Influence of the Regularisation. In our second experiment we investigate the
influence of the regularisation on the quality of the reconstruction and its
reprojection. To this end, we consider the Sombrero test image and vary the
regularisation parameter $\alpha$ while the other parameters are kept fixed
($\tau = 0.001$, $n = 10^4$). The outcome is visualised in Fig. 6. While the
reprojection related error measure (RIE) increases for a moderate amount of
regularisation but is overall very low, the surface related error measure (RSE)
decreases by almost a factor of three (from $4.4 \times 10^{-2}$ to
$1.7 \times 10^{-2}$). This, however, is not surprising, since the computed surface
typically exhibits some form of smoothness and thus benefits from a moderate amount of
regularisation. Since the actual purpose of SfS is to find the correct surface, this
shows that the regularisation may have an overall positive impact on the quality of
the results.






Fig. 3 First column, from top to bottom: Input image, reprojected image, ground truth
depth, computed depth for the Sombrero test image ($\alpha = 7.5 \times 10^{-5}$,
$\tau = 10^{-2}$, $n = 10^6$). Second column: Ditto for the Stanford Bunny test image
($\alpha = 7.5 \times 10^{-5}$, $\tau = 10^{-3}$, $n = 10^6$). Third column: Ditto
for the Dragon test image ($\alpha = 7.5 \times 10^{-8}$, $\tau = 10^{-3}$,
$n = 10^6$).

Fig. 4 First row, from left to right: Input image and ground truth depth of the
Suzanne test image. Second row: Reprojected image and the computed depth
($\alpha = 10^{-7}$, $\tau = 10^{-3}$, $n = 10^6$).



Fig. 5 Surface error maps. From left to right: Stanford Bunny, Dragon and Suzanne. Red denotes 
errors above 1 percent, where the intensity encodes the error magnitude. White denotes errors 
below 1 percent. The Sombrero is not shown, since the error is below 1 percent everywhere. 

Table 2 Results for our approach with quadratic and subquadratic penaliser. Error
measures are given in terms of the relative surface error (RSE) and the relative image
error (RIE). Best results for each test image are highlighted boldface. Same
parameters as in Fig. 3 and Fig. 4.

                      quadratic             subquadratic
                    RSE       RIE         RSE       RIE        runtime
  Sombrero        0.00208   0.00694     0.00318   0.00209      29113 s
  Stanford Bunny  0.00546   0.00015     0.00439   0.00007      23969 s
  Dragon          0.01376   0.00028     0.01376   0.00028      25350 s
  Suzanne         0.00392   0.00011     0.00251   0.00002      48395 s


Fig. 6 Impact of the amount of regularisation on the reconstruction quality and the reprojection 
accuracy for the Sombrero test image. 


Table 3 Impact of different initialisations on the reconstruction quality and
reprojection accuracy for the Stanford Bunny ($\alpha = 7.5 \times 10^{-5}$,
$\tau = 10^{-3}$, $n = 10^6$).

                      initial error         after computation
                    RSE       RIE         RSE       RIE
  plane (z = 1)   0.25804   1.63174     0.00439   0.00007
  plane (z = 10)  6.41960   0.97373     0.00439   0.00007
  proposed        0.37712   0.74363     0.00439   0.00007


Independence of the Initialisation. In our third experiment we analyse the
dependency of our approach on the initialisation. To this end, we use the Stanford
Bunny ($z \in [1,2]$) and compare our initialisation on the coarsest scale of the
proposed coarse-to-fine scheme (cf. Eq. (30)) with two other initialisations based on
planar surfaces ($z = 1$, $z = 10$). The initial error and the outcome after
$n = 10^6$ iterations are listed in Table 3. While the initial error for a good guess
($z = 1$) and a poor initialisation ($z = 10$) differs significantly, the quality of
the reconstruction and the reprojection is identical after sufficiently many
iterations. This also holds for our initialisation, which can be computed from the
input image without requiring specific knowledge of the depth. That all
initialisations converge to the same solution, however, is not surprising since the
estimation is embedded in our coarse-to-fine scheme.

Comparison of Numerical Schemes. In our fourth experiment we compare the
different numerical schemes proposed in Section 4: the full explicit scheme, the
simplified explicit scheme and the alternating explicit scheme. In the first part of
the experiment we juxtapose the quality of the different numerical schemes for equal
stopping times (iterations $\times$ time step size). As one can see from the results
in Table 4, the full explicit scheme clearly gives the best results in terms of
reconstruction quality and reprojection accuracy. However, this comes at the expense
of a significantly larger runtime, since more iterations are needed due to the time
step restrictions discussed in Section 4. In fact, the runtime is up to four orders
of magnitude larger, making the approach hardly feasible for larger image sizes. In
the second part of the experiment we compared the numerical schemes for an equal
number of iterations. From the results in Table 5 it becomes evident that in this case
the simplified explicit scheme and in particular the alternating explicit scheme
perform best in most cases in terms of reconstruction quality and reprojection
accuracy. This demonstrates that it can be worthwhile to (partly) omit the terms that
are added in the full explicit scheme since they slow down the convergence, but doing
so does not necessarily compromise the quality.


Table 4 Comparison of different numerical schemes for equal stopping time
$t = n \times \tau$. Results and runtimes refer to smaller versions of the four test
images. Same parameters as in Fig. 3 and Fig. 4 except for $n$, which is given by
$n = t/\tau$.

                                     alternating scheme             simplified scheme              full scheme
  test image                         RSE      RIE      runtime      RSE      RIE      runtime      RSE      RIE      runtime
  Small Sombrero (128 × 128)         0.01823  0.01920      30 s     0.01820  0.02048      15 s     0.00785  0.00527  178021 s
  Small Stanford Bunny (128 × 128)   0.00659  0.00151     303 s     0.00667  0.00257     150 s     0.00576  0.00097    4278 s
  Small Dragon (128 × 128)           0.01667  0.00267     308 s     0.01673  0.00620     149 s     0.01526  0.00205    4304 s
  Small Suzanne (128 × 96)           0.00899  0.00514     223 s     0.01055  0.01909     111 s     0.01022  0.00203    2384 s



Reconstruction with Inpainting. In our fifth experiment we demonstrate the
inpainting capabilities of the regularisation in combination with the confidence
function $c$ embedded in the data term. For this reason we created a pair of degraded
Stanford Bunny test images together with the corresponding confidence functions,
which are both depicted in Fig. 7. In addition, the computed depth values and the
reprojected images are shown. One can see that in both cases the missing regions in
the input image hardly deteriorate the quality of the results since the smoothness
term fills in the information from the neighbourhood. This is also reflected in
the error measures given in Table 6. In case of the perforated version the surface
error even remains the same compared to the result for the original version.

Table 5 Comparison of different numerical schemes for equal number of iterations.
Results refer to the smaller versions of the four test images, see Table 4. The same
parameters as in Fig. 3 and Fig. 4 have been used except for $n$, which is given by
$n = 10^7$.

                          alternating scheme     simplified scheme      full scheme
  test image              RSE       RIE          RSE       RIE          RSE       RIE
  Small Sombrero          0.02357   0.00082      0.02392   0.00659      0.00358   0.00319
  Small Stanford Bunny    0.00390   0.00001      0.00378   0.00004      0.00489   0.00047
  Small Dragon            0.00572   0.00001      0.00562   0.00001      0.00964   0.00170
  Small Suzanne           0.00319   0.00002      0.00320   0.00001      0.00505   0.00056



Table 6 Evaluation of inpainting properties for degraded versions of the Stanford
Bunny test image. Same parameters as in Figure 7.

         perforated version      sliced version             original version
         (Fig. 7, top row)       (Fig. 7, bottom row)       (Fig. 2)
  RSE    0.00439                 0.00509                    0.00439
  RIE    0.00039                 0.00249                    0.00007



Comparison with a PDE-Based Approach. In our sixth experiment we compare the
results of our variational method with the PDE-based approach of Vogel et al. [50]
with Lambertian reflectance model. This essentially comes down to a comparison with
the baseline method of Prados et al. [39], which is solved by Vogel et al. [50] as
part of a Phong-based model using an efficient fast marching scheme P5I.
In this experiment we consider two scenarios that nicely demonstrate the advantages
and shortcomings of the different types of methods: On the one hand, we use
input images without noise; on the other hand, we added Gaussian noise of standard
deviation $\sigma = 20$ before applying the two methods. The corresponding results are
summarised in Tables 7 and 8, respectively. For the test images without noise both
approaches give excellent results with errors around or below 1 percent of the
solution. The approach of Vogel et al. gives slightly better results in terms of
the relative surface error (RSE), while the variational approach gives better results
in terms of the relative image error (RIE). From the viewpoint of the variational
approach this can be explained as follows: While the data term penalises the
photometric reprojection error and thus gives rather small RIE values, the
regulariser and the coarse-to-fine scheme yield a moderate smoothing of the surface

Fig. 7 First row, from left to right: Perforated version of the Stanford Bunny test
image, corresponding confidence function $c$, computed depth values, reprojected image
($\alpha = 7.5 \times 10^{-5}$, $\tau = 10^{-3}$, $n = 10^6$). Second row: Ditto for
the sliced version (same parameters).



Fig. 8 From left to right: Noisy version of the Stanford Bunny (Gaussian noise with
$\sigma = 20$), ground truth depth, computed depth using our variational approach
($\alpha = 1.0$, $\tau = 10^{-5}$, $n = 10^6$), computed depth using the PDE-based
approach of Vogel et al. [50] with Lambertian model.


resulting in slightly higher RSE values. In the case of the noisy input images the
findings are completely different. Here, the variational method can take advantage
of both the regulariser and the independence of the initialisation. While a higher
smoothness weight allows us to obtain a smooth surface, the hierarchical
initialisation via the coarse-to-fine scheme does not need to rely on noisy solutions
at critical points, unlike the PDE-based approach of Vogel et al. As a consequence,
the resulting surface errors of 3 to 6 percent for our variational approach are
significantly lower than those of the PDE-based model (11 to 20 percent). This can
also be seen from the depth estimates for the Stanford Bunny depicted in Figure 8.
Not surprisingly, our findings are in full accordance with the observations in [29],
in which the robustness of variational methods for perspective SfS has been
investigated.
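
For readers who want to set up a comparable noise scenario, the degradation step could look roughly as follows. This is a sketch under stated assumptions: the text only specifies additive Gaussian noise with standard deviation 20, whereas the rounding and the clipping to the 8-bit range $[0, 255]$, the fixed seed, and the function name are choices made here for illustration.

```python
import random


def add_gaussian_noise(pixels, sigma=20.0, seed=0):
    """Add zero-mean Gaussian noise to 8-bit grey values and clip to [0, 255]."""
    rng = random.Random(seed)   # fixed seed so that the degradation is reproducible
    noisy = []
    for value in pixels:
        perturbed = value + rng.gauss(0.0, sigma)
        noisy.append(min(255, max(0, round(perturbed))))   # assumed: round and clip
    return noisy


clean = [0, 64, 128, 192, 255] * 100
noisy = add_gaussian_noise(clean, sigma=20.0)
print(noisy[:5])
```

With a standard deviation of 20 on a range of 256 grey levels, the perturbation is substantial, which explains why methods without regularisation degrade noticeably in this scenario.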

Results on Real-World Images. Finally, in order to evaluate our approach on
real-world images, we used two images of faces provided by Prados [40]. According
to Prados, these images have been taken with a cheap digital camera in a dark

Table 7 Comparison between our variational method and the PDE-based approach of
Vogel et al. [50] with Lambertian reflectance model (= baseline model of Prados et
al. [39]). Error measures are given in terms of the relative surface error (RSE) and
the relative image error (RIE). Same parameters as in Fig. 3 and Fig. 4.

                    Vogel et al. [50]         our method
                    (PDE-based approach)      (variational method)
                    RSE       RIE             RSE       RIE
  Sombrero          0.00301   0.00495         0.00318   0.00209
  Stanford Bunny    0.00266   0.00154         0.00439   0.00007
  Dragon            0.00422   0.00255         0.01376   0.00028
  Suzanne           0.00253   0.00082         0.00251   0.00002



Table 8 Performance under noise. Comparison between our variational method and the
PDE-based approach of Vogel et al. [50] with Lambertian reflectance model (= baseline
model of Prados et al. [39]). Gaussian noise of standard deviation $\sigma = 20$.
Error measures are given in terms of the relative surface error (RSE) and the relative
image error (RIE). The applied parameters are as follows: Sombrero ($\alpha = 0.1$,
$\tau = 10^{-5}$, $n = 10^6$), Stanford Bunny ($\alpha = 1.0$, $\tau = 10^{-5}$,
$n = 10^6$), Dragon ($\alpha = 1.0$, $\tau = 10^{-5}$, $n = 10^6$), Suzanne
($\alpha = 1.0$, $\tau = 5 \times 10^{-6}$, $n = 10^6$).

                          Vogel et al. [50]         our method
                          (PDE-based approach)      (variational method)
                          RSE       RIE             RSE       RIE
  Noisy Sombrero          0.19530   0.27254         0.05118   0.13239
  Noisy Stanford Bunny    0.10973   0.17347         0.03235   0.15279
  Noisy Dragon            0.12240   0.19409         0.05395   0.18767
  Noisy Suzanne           0.12134   0.16783         0.01256   0.14302



place, where the scene is illuminated by the flash of the camera. The focal length is
$f = 5.8$ mm and the grid size is approximately $h_x = h_y = 0.018$ mm. The test images
as well as additional images rendered from a new viewpoint using the computed
depth are shown in Fig. 9. In both cases the results look quite realistic. One can also
see how the depth values at the eyes have been inpainted in the reconstruction, since
a manually defined confidence function was used to mask out those regions where
the assumption of a Lambertian surface is violated.

Fig. 9 First row, from left to right: Face with closed eyes, three images rendered
from a new viewpoint using the estimated depth ($\alpha = 7.5 \times 10^{-5}$,
$\tau = 5 \times 10^{-3}$, $n = 2 \times 10^5$). Second row: Ditto for the second test
image ($\alpha = 7.5 \times 10^{-5}$, $\tau = 5 \times 10^{-3}$, $n = 2 \times 10^5$).


7 Conclusion 

In this paper, we described a novel variational model for perspective shape from
shading that not only has many desirable theoretical properties but also yields very
convincing reconstruction results for synthetic and real-world input images, even in
the presence of noise or other deteriorations in an input image. While the arising
optimisation problem has turned out to be challenging, we have proposed an
alternating explicit scheme embedded in a coarse-to-fine framework that is robust with
respect to the initialisation and that allows reasonable computation times compared
to a standard explicit scheme.

Besides the results that are documented via extensive experiments in this chapter,
let us point out that we see a main contribution of our work in a different context, as
we have laid the fundamental building block for a conceptually correct, working
variational framework that can combine perspective shape from shading with other
techniques from computer vision such as stereo vision. We aim to explore the
arising possibilities in future work.

8 Appendix


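Alternative Derivation of the Surface Normal. Instead of computing the derivatives
with respect to the 2-D image coordinates $x$ and $y$, one can also derive the
surface normal in an alternative way that is often used in the literature, see e.g.
[52]. The idea is to interpret the original surface in Eq. (4) as a function of the
3-D coordinates $X$, $Y$ and $Z(X,Y)$

$$\mathcal{S}\big(X(\mathbf{x},z), Y(\mathbf{x},z), Z(X(\mathbf{x},z),Y(\mathbf{x},z))\big) = \begin{pmatrix} X(\mathbf{x},z) \\ Y(\mathbf{x},z) \\ Z(X(\mathbf{x},z),Y(\mathbf{x},z)) \end{pmatrix} = \begin{pmatrix} z\,x / f \\ z\,y / f \\ -z \end{pmatrix}. \tag{44}$$

Dropping the dependency of $X$, $Y$ and $Z(X,Y)$ on $\mathbf{x}$, $z$ and computing the
partial derivatives with respect to $X$ and $Y$ via the chain rule

$$\frac{\partial X}{\partial Y} = \frac{\partial X}{\partial y}\,\frac{\partial y}{\partial Y}, \qquad \frac{\partial Y}{\partial X} = \frac{\partial Y}{\partial x}\,\frac{\partial x}{\partial X}$$

then gives the tangent vectors to the surface

$$\mathcal{S}_X(\mathbf{x},z) = \frac{1}{z + z_x x} \begin{pmatrix} z + z_x x \\ z_x\, y \\ -f z_x \end{pmatrix}, \qquad \mathcal{S}_Y(\mathbf{x},z) = \frac{1}{z + z_y y} \begin{pmatrix} z_y\, x \\ z + z_y y \\ -f z_y \end{pmatrix}. \tag{45}$$

After some computations we finally obtain the corresponding normal direction

$$\bar{\mathbf{n}}(\mathbf{x}) = \mathcal{S}_X(\mathbf{x},z) \times \mathcal{S}_Y(\mathbf{x},z) = \frac{f^2}{(z + z_x x)(z + z_y y)}\; \mathbf{n}(\mathbf{x}), \tag{46}$$

where $\mathbf{n}(\mathbf{x})$ is the normal direction from Eq. (8). As expected, both
vectors only differ by scale, i.e. they have the same direction. Hence, the
corresponding normalised vectors $\bar{\mathbf{n}}/|\bar{\mathbf{n}}|$ and
$\mathbf{n}/|\mathbf{n}|$ are identical. While this alternative derivation was not used
in our paper, it helps to clarify a common mistake in the literature that will be
explained in the following.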
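
The claim that both normal directions differ only by a scalar factor can be checked numerically. The sketch below rests on explicit assumptions: it takes the Cartesian depth surface to be $(zx/f, zy/f, -z)^\top$, builds the tangent vectors once with respect to $(x, y)$ and once with respect to $(X, Y)$ using the chain rule with non-vanishing cross derivatives, and compares the two cross products. All function names and sample values are hypothetical.

```python
def cross(u, v):
    """Cross product of two 3-D vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def normal_directions(x, y, z, z_x, z_y, f):
    """Return (n, n_bar, scale) for the assumed surface (z x / f, z y / f, -z)."""
    A = z + z_x * x
    B = z + z_y * y
    # tangent vectors with respect to the image coordinates x and y
    S_x = (A / f, y * z_x / f, -z_x)
    S_y = (x * z_y / f, B / f, -z_y)
    # tangent vectors with respect to X and Y (cross derivatives do not vanish)
    S_X = (1.0, z_x * y / A, -f * z_x / A)
    S_Y = (z_y * x / B, 1.0, -f * z_y / B)
    return cross(S_x, S_y), cross(S_X, S_Y), f * f / (A * B)


n, n_bar, scale = normal_directions(x=0.3, y=-0.2, z=1.5, z_x=0.4, z_y=-0.1, f=2.0)
print(n, n_bar, scale)
```

For these sample values each component of `n_bar` equals `scale` times the corresponding component of `n`, so the normalised vectors coincide as long as the scale factor is positive.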

Remark. Please note that, unlike in the orthographic case, the cross derivatives
$\partial X/\partial Y$ and $\partial Y/\partial X$ do not vanish for the perspective
model. Hence, using the orthographic derivation of the normal direction from Horn and
Brooks [27] with zero cross derivatives and simply replacing the remaining partial
derivatives $Z_X$ and $Z_Y$ by the corresponding expressions from (45) is not
completely correct for the perspective case. Such an approach has for instance been
proposed in [52,55]. It actually mixes the orthographic and the perspective model and
thus typically gives worse results in the case of strong perspective distortions.
Moreover, apart from not being completely correct, this strategy also yields
significantly more complex models that typically require auxiliary variables to be
solved, see again e.g. [52,55].